[23.1] Fix metadata setting in extended metadata + outputs_to_working_directory mode #16678

Merged
merged 10 commits into from
Sep 13, 2023

@mvdbeek mvdbeek commented Sep 12, 2023

            outputs = [
                Bunch(false_path=os.path.join(outputs_directory, os.path.basename(path)), real_path=path)
                for path in self.get_output_files(job_wrapper)
            ]

would always set the false_path attribute, even when that is incorrect
because outputs_to_working_directory isn't enabled and Pulsar doesn't
need to do any staging (e.g. with

      default_file_action: copy
      # don't copy outputs, not needed.
      file_actions:
        paths:
          - path_types: output
            action: none

as done in embedded_pulsar_metadata_extended_job_conf.xml).

Fixing this lets us drop the `if object_store:` guard around
setting the external filename, which in turn fixes metadata value
setting when `outputs_to_working_directory: true` and `metadata_strategy: extended` are used together.
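
As a rough illustration of the idea (not the actual patch), the sketch below sets `false_path` only when outputs are really staged. The names `build_outputs`, `stage_outputs` and `effective_path` are hypothetical and do not come from the Galaxy or Pulsar code base, and `SimpleNamespace` stands in for Galaxy's `Bunch`.

```python
import os
from types import SimpleNamespace


def build_outputs(output_paths, outputs_directory, stage_outputs):
    """Describe job outputs, setting false_path only when outputs are staged.

    `stage_outputs` is a hypothetical flag standing in for "outputs are
    written somewhere other than their final location and copied back".
    """
    outputs = []
    for path in output_paths:
        false_path = None
        if stage_outputs:
            # Staged outputs are written into the job's outputs directory first.
            false_path = os.path.join(outputs_directory, os.path.basename(path))
        outputs.append(SimpleNamespace(real_path=path, false_path=false_path))
    return outputs


def effective_path(output):
    """Path the job actually writes to: false_path if set, else real_path."""
    return output.false_path or output.real_path


staged = build_outputs(["/data/files/dataset_1.dat"], "/jobs/1/outputs", True)
unstaged = build_outputs(["/data/files/dataset_1.dat"], "/jobs/1/outputs", False)
```

With `false_path` left unset in the unstaged case, code that picks the external filename can do so unconditionally, which is the kind of simplification described above.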

We didn't catch this previously because of the outpus_to_working_dir typo ...

The actual, isolated fix is in 8c5ac57, but I noticed that use_metadata_binary is probably broken in pulsar, so 6210267 should also fix that.


We'll surely need that too, as well as a datatypes config file.

I'd assume this probably didn't fully work before and that we have no
tests? It does seem a little hard to test, and I found no traces
of tests.
@bgruening

Wow, cool!

@mvdbeek

mvdbeek commented Sep 12, 2023

Alright, so now the extended + outputs to working dir tests are broken ...

This looks quite weird; can we maybe just get the outputs from the tool
run?
@mvdbeek mvdbeek force-pushed the fix_metadata_setting branch from 3b16a93 to 01d4d43 Compare September 12, 2023 17:53
@mvdbeek mvdbeek force-pushed the fix_metadata_setting branch from cddc58b to 3930c6a Compare September 12, 2023 21:46
@mvdbeek

mvdbeek commented Sep 13, 2023

Hmm, I'm not quite sure what's up with the converter tests ... it seems other PRs are passing this test, and locally this passes too (but I can only test with --biocontainers) ... I'm inclined to switch over to --biocontainers for better reproducibility. I assume the index worked fine and there's just some minor difference in the index file.

@mvdbeek

mvdbeek commented Sep 13, 2023

OK, the CONVERTER_Bam_Bai_0 failed at least in #16683 as well, IMO confirming that using containerized dependencies is a good idea here.

@mvdbeek mvdbeek merged commit a7caa73 into galaxyproject:release_23.1 Sep 13, 2023
@galaxyproject galaxyproject deleted a comment from github-actions bot Sep 13, 2023
@dannon dannon modified the milestones: 23.2, 23.1 Sep 22, 2023